Catch trial performance

## [1] "Excluded 1 participants based on catch-trial performance."

Exclusion of random guesses

We further exclude participants who appear to give ratings that are independent of the scene they are seeing. To quantify this, we compute each participant’s mean rating for each utterance across all trials, and then the correlation between that participant’s actual ratings and these mean ratings. A high correlation is unexpected: it means that the ratings barely vary with the scene, which indicates that the participant chose ratings at random with respect to it. We therefore also exclude the data from participants for whom this correlation is larger than 0.75.
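The criterion described above can be sketched in base R as follows. This is a minimal illustration on toy data, not the analysis code itself: the `rating` column, the offsets, and the two made-up participants are assumptions, while `workerid`, `modal`, and `percentage_blue` follow the grouping variables named in the output of this report.

```r
# Toy data (hypothetical): two participants rate three utterances at five scenes.
d <- expand.grid(
  workerid = c("w1", "w2"),
  modal = c("bare", "might", "probably"),
  percentage_blue = c(0, 25, 50, 75, 100),
  stringsAsFactors = FALSE
)

# Utterance-specific offsets and small noise, used only to generate toy ratings.
offset <- unname(c(bare = 0, might = 40, probably = 20)[d$modal])
noise <- (d$percentage_blue %% 50) / 25 - 0.5

# w1's ratings depend on the scene; w2's depend (almost) only on the utterance.
d$rating <- ifelse(d$workerid == "w1",
                   abs(d$percentage_blue - offset),
                   offset + noise)

# Mean rating per participant and utterance, repeated on every trial row.
d$mean_rating <- ave(d$rating, d$workerid, d$modal)

# Correlation between actual ratings and those means, one value per participant.
r_by_worker <- sapply(split(d, d$workerid),
                      function(x) cor(x$rating, x$mean_rating))

# Participants above the 0.75 threshold are excluded.
excluded <- names(r_by_worker)[which(r_by_worker > 0.75)]
d_clean <- d[!d$workerid %in% excluded, ]
print(round(r_by_worker, 2))
```

In this toy example only the scene-insensitive participant `w2` exceeds the threshold. The `which()` call guards against participants whose ratings have zero variance, for whom `cor()` returns `NA`.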

## [1] "Excluded 0 participants based on random responses."

Aggregated results

Comparison across conditions

Individual responses

AUC computation

We compute the area under the curve (AUC) directly, using the AUC function with spline interpolation.
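A spline-based AUC can be sketched in base R by integrating an interpolating spline. This is an illustration under assumptions: the helper names `auc_spline` and `auc_trapezoid` and the toy ratings are made up here, and the report's own `AUC` call (possibly `DescTools::AUC` with `method = "spline"`, since DescTools is loaded in this session) may differ in its details.

```r
# Area under a natural cubic spline that interpolates the points (x, y).
auc_spline <- function(x, y) {
  f <- splinefun(x, y, method = "natural")
  integrate(f, lower = min(x), upper = max(x))$value
}

# Trapezoid rule on the same points, for comparison.
auc_trapezoid <- function(x, y) {
  o <- order(x)
  sum(diff(x[o]) * (head(y[o], -1) + tail(y[o], -1)) / 2)
}

# Toy rating curve (hypothetical values): rating as a function of the scene.
percentage_blue <- c(0, 25, 50, 75, 100)
rating <- c(0, 0.1, 0.6, 0.9, 1)

print(auc_spline(percentage_blue, rating))
print(auc_trapezoid(percentage_blue, rating))
```

The two estimates agree exactly when the points lie on a straight line and differ slightly for curved rating profiles, where the spline follows the curvature between the sampled scenes.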

t-test and regression model with control variables:

## 
##  Two Sample t-test
## 
## data:  aucs.cautious$auc_diff and aucs.confident$auc_diff
## t = 3.1769, df = 122, p-value = 0.001886
## alternative hypothesis: true difference in means is not equal to 0
## 95 percent confidence interval:
##   4.61496 19.87576
## sample estimates:
## mean of x mean of y 
## 21.200203  8.954844
## Linear mixed model fit by REML. t-tests use Satterthwaite's method [
## lmerModLmerTest]
## Formula: 
## auc_diff ~ cond + test_order + first_speaker_type + confident_speaker +  
##     first_speaker_type * cond + test_order * cond + (1 | workerid)
##    Data: auc_d
## 
## REML criterion at convergence: 1072.4
## 
## Scaled residuals: 
##     Min      1Q  Median      3Q     Max 
## -3.4895 -0.5075  0.1042  0.5968  1.8987 
## 
## Random effects:
##  Groups   Name        Variance Std.Dev.
##  workerid (Intercept)  47.45    6.888  
##  Residual             375.41   19.376  
## Number of obs: 124, groups:  workerid, 62
## 
## Fixed effects:
##                           Estimate Std. Error     df t value Pr(>|t|)    
## (Intercept)                 14.817      1.950 58.000   7.600 2.87e-10 ***
## cond1                        6.162      1.741 59.000   3.540 0.000789 ***
## test_order1                  1.931      1.952 58.000   0.989 0.326629    
## first_speaker_type1         -6.627      1.950 58.000  -3.399 0.001228 ** 
## confident_speaker1           1.440      1.954 58.000   0.737 0.464164    
## cond1:first_speaker_type1    1.225      1.741 59.000   0.704 0.484333    
## cond1:test_order1           -1.909      1.740 59.000  -1.097 0.277067    
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Correlation of Fixed Effects:
##             (Intr) cond1  tst_r1 frs__1 cnfd_1 c1:__1
## cond1        0.000                                   
## test_order1  0.002  0.000                            
## frst_spkr_1  0.033  0.000  0.002                     
## cnfdnt_spk1 -0.033  0.000 -0.065 -0.033              
## cnd1:frs__1  0.000  0.032  0.000  0.000  0.000       
## cnd1:tst_r1  0.000  0.000  0.000  0.000  0.000  0.000
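Two features of the output above can be illustrated with a short sketch on simulated data. The header "Two Sample t-test" with df = 122 is what a pooled-variance test on two groups of 62 produces, and coefficient names such as `cond1` are what R prints for sum-coded factors. The simulated values below are stand-ins, not the real AUC differences, and the single-predictor `lm` is a simplification of the mixed model.

```r
set.seed(1)
n <- 62

# Simulated stand-ins for the per-participant AUC differences (not the real data).
cautious <- rnorm(n, mean = 21, sd = 20)
confident <- rnorm(n, mean = 9, sd = 20)

# Pooled-variance two-sample test: df = 2 * n - 2.
pooled <- t.test(cautious, confident, var.equal = TRUE)

# Paired alternative that uses the within-participant pairing: df = n - 1.
paired <- t.test(cautious, confident, paired = TRUE)

# Sum (deviation) coding for a two-level factor.
long <- data.frame(
  auc_diff = c(cautious, confident),
  cond = factor(rep(c("cautious", "confident"), each = n))
)
contrasts(long$cond) <- contr.sum(2)
fit <- lm(auc_diff ~ cond, data = long)

# Intercept = mean of the two condition means; cond1 = half their difference.
print(coef(fit))
print(c(pooled = unname(pooled$parameter), paired = unname(paired$parameter)))
```

Under this coding, a `cond1` estimate of about 6.2 corresponds to a difference between conditions of about twice that size, which is in line with the difference between the two sample means reported by the t-test.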

Clustering analyses

library(mclust)
## Package 'mclust' version 5.4.10
## Type 'citation("mclust")' for citing this R package in publications.
## 
## Attaching package: 'mclust'
## The following object is masked from 'package:DescTools':
## 
##     BrierScore
## The following object is masked from 'package:bootstrap':
## 
##     diabetes
aucs_diff = merge(aucs.cautious, aucs.confident, by=c("workerid"))
aucs_diff$diff_of_diffs = aucs_diff$auc_diff.x - aucs_diff$auc_diff.y

aucs_diff %>% ggplot(aes(x=diff_of_diffs)) + geom_density() + geom_jitter(aes(y=0), width=0, height=0.001)  + ggtitle("Raw data + estimated density")

Gaussian mixture models of differences of AUC differences

1 Cluster

fit1 = Mclust(aucs_diff$diff_of_diffs, G=1)
print(summary(fit1, parameters=TRUE))
## ---------------------------------------------------- 
## Gaussian finite mixture model fitted by EM algorithm 
## ---------------------------------------------------- 
## 
## Mclust X (univariate normal) model with 1 component: 
## 
##  log-likelihood  n df       BIC       ICL
##       -292.5732 62  2 -593.4007 -593.4007
## 
## Clustering table:
##  1 
## 62 
## 
## Mixing probabilities:
## 1 
## 1 
## 
## Means:
## [1] 12.24536
## 
## Variances:
## [1] 735.0717

2 Clusters

fit2 = Mclust(aucs_diff$diff_of_diffs, G=2)
print(summary(fit2, parameters=TRUE))
## ---------------------------------------------------- 
## Gaussian finite mixture model fitted by EM algorithm 
## ---------------------------------------------------- 
## 
## Mclust E (univariate, equal variance) model with 2 components: 
## 
##  log-likelihood  n df      BIC       ICL
##       -288.3347 62  4 -593.178 -599.2633
## 
## Clustering table:
##  1  2 
## 55  7 
## 
## Mixing probabilities:
##        1        2 
## 0.868004 0.131996 
## 
## Means:
##       1       2 
##  4.7011 61.8563 
## 
## Variances:
##        1        2 
## 360.7939 360.7939

3 Clusters

fit3 = Mclust(aucs_diff$diff_of_diffs, G=3)
print(summary(fit3, parameters=TRUE))
## ---------------------------------------------------- 
## Gaussian finite mixture model fitted by EM algorithm 
## ---------------------------------------------------- 
## 
## Mclust E (univariate, equal variance) model with 3 components: 
## 
##  log-likelihood  n df       BIC       ICL
##       -288.3427 62  6 -601.4481 -656.1567
## 
## Clustering table:
##  1  2  3 
## 10 45  7 
## 
## Mixing probabilities:
##         1         2         3 
## 0.3358024 0.5359504 0.1282473 
## 
## Means:
##         1         2         3 
## -1.157268  8.579513 62.658483 
## 
## Variances:
##        1        2        3 
## 341.6108 341.6108 341.6108

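According to the Bayesian information criterion (BIC), a model with two clusters describes the data best, though only by a narrow margin over the one-cluster model (BIC of -593.18 vs. -593.40; mclust reports BIC on a scale where larger values are preferred).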
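The BIC values in the three summaries can be reproduced from the reported log-likelihoods. mclust defines BIC as 2 log L - df log(n), so the model with the largest value is preferred. The sketch below copies the numbers from the output; the helper name `bic_mclust` is made up for this illustration.

```r
# BIC on the mclust scale: larger values indicate the preferred model.
bic_mclust <- function(loglik, df, n) 2 * loglik - df * log(n)

# Log-likelihoods and degrees of freedom copied from the three summaries (n = 62).
fits <- data.frame(
  G = 1:3,
  loglik = c(-292.5732, -288.3347, -288.3427),
  df = c(2, 4, 6)
)
fits$BIC <- bic_mclust(fits$loglik, fits$df, n = 62)

print(fits)
best_G <- fits$G[which.max(fits$BIC)]
print(best_G)
```

The same comparison can be run in a single call by passing a vector of component counts to `Mclust`, for example `G = 1:3`.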

Fitted model:

# Model "E" stores one shared variance, so sigmasq has length 1 and sigmasq[2] is NA,
# which silently drops the second curve. Recycle it to one value per component.
sigma = sqrt(rep_len(fit2$parameters$variance$sigmasq, fit2$G))

aucs_diff %>% 
  ggplot(aes(x=diff_of_diffs)) + 
    geom_jitter(aes(y=0, color=first_speaker_type.x), width=0, height=0.001)  +
    ggtitle("Raw data + Components of Gaussian mixture") + 
    stat_function(fun = dnorm, args = list(mean = fit2$parameters$mean[1], sd = sigma[1])) + 
    stat_function(fun = dnorm, args = list(mean = fit2$parameters$mean[2], sd = sigma[2]))
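To see how the two-component model assigns participants to clusters, the posterior probability of the second component can be computed from the fitted parameters. The sketch below copies the mixing probabilities, means, and shared variance from the two-cluster summary; the function name `posterior2` is made up for this illustration.

```r
# Parameters copied from the two-component summary (shared variance, model "E").
pro <- c(0.868004, 0.131996)
mu <- c(4.7011, 61.8563)
sigma <- sqrt(360.7939)

# Posterior probability that a value x belongs to component 2.
posterior2 <- function(x) {
  w1 <- pro[1] * dnorm(x, mean = mu[1], sd = sigma)
  w2 <- pro[2] * dnorm(x, mean = mu[2], sd = sigma)
  w2 / (w1 + w2)
}

# Value at which both components are equally probable (numerical and closed form).
boundary <- uniroot(function(x) posterior2(x) - 0.5, interval = mu, tol = 1e-9)$root
boundary_exact <- mean(mu) + sigma^2 * log(pro[1] / pro[2]) / (mu[2] - mu[1])

print(c(numerical = boundary, closed_form = boundary_exact))
```

Because the first component has a much larger mixing probability, the boundary lies closer to the second mean than to the first (roughly at 45), so only participants with a large difference of differences are assigned to the second cluster.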

Compute likelihoods based on the adaptation model

## Generalized linear mixed model fit by maximum likelihood (Laplace
##   Approximation) [glmerMod]
##  Family: binomial  ( logit )
## Formula: most_likely_model ~ condition + test_order + first_speaker_type +  
##     first_speaker_type * condition + test_order * condition +  
##     (1 | workerid)
##    Data: d.post_test
## 
##      AIC      BIC   logLik deviance df.resid 
##    150.7    170.4    -68.4    136.7      115 
## 
## Scaled residuals: 
##     Min      1Q  Median      3Q     Max 
## -1.4987 -0.5254 -0.2800  0.4157  2.1083 
## 
## Random effects:
##  Groups   Name        Variance Std.Dev.
##  workerid (Intercept) 2.103    1.45    
## Number of obs: 122, groups:  workerid, 61
## 
## Fixed effects:
##                                              Estimate Std. Error z value
## (Intercept)                                   -0.4898     0.3292  -1.488
## conditioncautious                             -0.9138     0.3080  -2.966
## test_orderparallel                            -0.5298     0.3271  -1.620
## first_speaker_typecautious                     0.8555     0.3556   2.406
## conditioncautious:first_speaker_typecautious  -0.2591     0.2529  -1.025
## conditioncautious:test_orderparallel           0.3858     0.2579   1.496
##                                              Pr(>|z|)   
## (Intercept)                                   0.13684   
## conditioncautious                             0.00301 **
## test_orderparallel                            0.10528   
## first_speaker_typecautious                    0.01614 * 
## conditioncautious:first_speaker_typecautious  0.30548   
## conditioncautious:test_orderparallel          0.13474   
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Correlation of Fixed Effects:
##             (Intr) cndtnc tst_rd frst__ cnd:__
## conditincts  0.212                            
## tst_rdrprll  0.105  0.230                     
## frst_spkr_t -0.184 -0.361 -0.205              
## cndtncts:__ -0.011  0.129  0.124 -0.105       
## cndtncts:t_ -0.063 -0.215 -0.100  0.196 -0.142
## Linear mixed model fit by REML. t-tests use Satterthwaite's method [
## lmerModLmerTest]
## Formula: likelihood_ratio ~ condition + test_order + first_speaker_type +  
##     first_speaker_type * condition + test_order * condition +  
##     (1 | workerid)
##    Data: d.post_test
## 
## REML criterion at convergence: 1562.1
## 
## Scaled residuals: 
##      Min       1Q   Median       3Q      Max 
## -2.30287 -0.60184 -0.09938  0.51907  2.78020 
## 
## Random effects:
##  Groups   Name        Variance Std.Dev.
##  workerid (Intercept)  1410     37.55  
##  Residual             30827    175.58  
## Number of obs: 122, groups:  workerid, 61
## 
## Fixed effects:
##                                              Estimate Std. Error     df t value
## (Intercept)                                    -17.45      16.61  58.00  -1.050
## conditioncautious                              -57.22      15.90  58.00  -3.598
## test_orderparallel                             -28.19      16.61  58.00  -1.697
## first_speaker_typecautious                      53.61      16.61  58.00   3.227
## conditioncautious:first_speaker_typecautious   -13.92      15.90  58.00  -0.876
## conditioncautious:test_orderparallel            14.45      15.90  58.00   0.909
##                                              Pr(>|t|)    
## (Intercept)                                  0.297866    
## conditioncautious                            0.000663 ***
## test_orderparallel                           0.095113 .  
## first_speaker_typecautious                   0.002058 ** 
## conditioncautious:first_speaker_typecautious 0.384790    
## conditioncautious:test_orderparallel         0.367239    
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Correlation of Fixed Effects:
##             (Intr) cndtnc tst_rd frst__ cnd:__
## conditincts  0.000                            
## tst_rdrprll -0.016  0.000                     
## frst_spkr_t  0.016  0.000  0.016              
## cndtncts:__  0.000  0.016  0.000  0.000       
## cndtncts:t_  0.000 -0.016  0.000  0.000  0.016
## Linear mixed model fit by REML. t-tests use Satterthwaite's method [
## lmerModLmerTest]
## Formula: likelihood_ratio ~ condition + test_order + first_speaker_type +  
##     prior_likelihood_ratio + first_speaker_type * condition +  
##     test_order * condition + (1 | workerid)
##    Data: d.post_test
## 
## REML criterion at convergence: 1553.6
## 
## Scaled residuals: 
##     Min      1Q  Median      3Q     Max 
## -2.4754 -0.6991 -0.0235  0.4623  2.9507 
## 
## Random effects:
##  Groups   Name        Variance Std.Dev.
##  workerid (Intercept)     0      0     
##  Residual             29572    172     
## Number of obs: 122, groups:  workerid, 61
## 
## Fixed effects:
##                                              Estimate Std. Error       df
## (Intercept)                                   13.2128    18.0165 115.0000
## conditioncautious                            -57.2157    15.5731 115.0000
## test_orderparallel                           -22.6574    15.6585 115.0000
## first_speaker_typecautious                    47.4620    15.6786 115.0000
## prior_likelihood_ratio                         0.4017     0.1187 115.0000
## conditioncautious:first_speaker_typecautious -13.9242    15.5731 115.0000
## conditioncautious:test_orderparallel          14.4494    15.5731 115.0000
##                                              t value Pr(>|t|)    
## (Intercept)                                    0.733 0.464823    
## conditioncautious                             -3.674 0.000364 ***
## test_orderparallel                            -1.447 0.150624    
## first_speaker_typecautious                     3.027 0.003047 ** 
## prior_likelihood_ratio                         3.385 0.000976 ***
## conditioncautious:first_speaker_typecautious  -0.894 0.373126    
## conditioncautious:test_orderparallel           0.928 0.355433    
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Correlation of Fixed Effects:
##             (Intr) cndtnc tst_rd frst__ prr_l_ cnd:__
## conditincts  0.000                                   
## tst_rdrprll  0.039  0.000                            
## frst_spkr_t -0.044  0.000  0.004                     
## prr_lklhd_r  0.503  0.000  0.104 -0.116              
## cndtncts:__  0.000  0.016  0.000  0.000  0.000       
## cndtncts:t_  0.000 -0.016  0.000  0.000  0.000  0.016
## optimizer (nloptwrap) convergence code: 0 (OK)
## boundary (singular) fit: see help('isSingular')
## Data: d.post_test
## Models:
## model1: likelihood_ratio ~ condition + test_order + first_speaker_type + first_speaker_type * condition + test_order * condition + (1 | workerid)
## model2: likelihood_ratio ~ condition + test_order + first_speaker_type + prior_likelihood_ratio + first_speaker_type * condition + test_order * condition + (1 | workerid)
##        npar    AIC    BIC  logLik deviance  Chisq Df Pr(>Chisq)    
## model1    8 1622.4 1644.8 -803.21   1606.4                         
## model2    9 1613.0 1638.2 -797.48   1595.0 11.468  1   0.000708 ***
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
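The model comparison table can be checked by hand from its own entries. The sketch below copies the parameter counts and log-likelihoods from the table and recomputes AIC, BIC, and the likelihood-ratio test. Small discrepancies are due to rounding in the printed values.

```r
# Values copied from the model comparison table (122 observations).
npar <- c(model1 = 8, model2 = 9)
loglik <- c(model1 = -803.21, model2 = -797.48)
n_obs <- 122

# Information criteria on the usual scale (smaller is better).
aic <- -2 * loglik + 2 * npar
bic <- -2 * loglik + log(n_obs) * npar

# Likelihood-ratio statistic from the rounded log-likelihoods.
chisq_from_loglik <- 2 * (loglik[["model2"]] - loglik[["model1"]])

# p-value for the reported statistic with one additional parameter.
p_value <- pchisq(11.468, df = 1, lower.tail = FALSE)

print(round(rbind(aic, bic), 1))
print(c(chisq = chisq_from_loglik, p = p_value))
```

The log-likelihoods in this table differ from the REML criteria in the individual summaries because `anova` on lme4 models refits them with maximum likelihood by default before comparing them.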

List of adapters:

workerid  first_speaker_type  test_order  noticed_manipulation  cautious_count  confident_count  aligned_count  first_adaptation_speaker_count
1352      cautious            reverse     1                     1               1                2              1
1353      confident           parallel    0                     1               1                2              1
1354      cautious            parallel    0                     1               1                2              1
1355      confident           reverse     0                     1               1                2              1
1359      confident           reverse     0                     1               1                2              1
1365      cautious            reverse     0                     1               1                2              1
1368      confident           parallel    1                     1               1                2              1
1370      confident           parallel    1                     1               1                2              1
1373      cautious            reverse     1                     1               1                2              1
1375      confident           reverse     1                     1               1                2              1
1385      cautious            reverse     1                     1               1                2              1
1387      cautious            reverse     1                     1               1                2              1
1391      cautious            reverse     1                     1               1                2              1
1395      cautious            parallel    1                     1               1                2              1
1397      cautious            reverse     1                     1               1                2              1
1407      confident           reverse     0                     1               1                2              1
1417      cautious            reverse     1                     1               1                2              1
1423      cautious            parallel    1                     1               1                2              1
1428      cautious            parallel    1                     1               1                2              1
1432      confident           reverse     0                     1               1                2              1

List of reverse adapters:

workerid  first_speaker_type  test_order  noticed_manipulation  cautious_count  confident_count  aligned_count  first_adaptation_speaker_count
1404      confident           parallel    1                     1               1                0              1
1406      cautious            parallel    0                     1               1                0              1
1419      confident           reverse     0                     1               1                0              1